Site Reliability Engineer (with Java)

Technologies: Azure, Kubernetes, Datadog, Python, PostgreSQL, Cloudflare, Ubuntu, Jenkins, GitHub, GitHub Actions, Azure DevOps, and DataDome


Job Description:

  • You have proven experience demonstrating hands-on technical excellence and business impact in combining software engineering skills with systems engineering skills to solve complex automation and reliability challenges
  • You have deep technical experience with various cloud providers, containerization technologies, automated deployment frameworks, orchestration frameworks, monitoring, logging, alerting, system internals, networking, databases, distributed systems, and service-oriented architecture
  • You have the skills to implement load, stress, performance, and reliability testing standards at scale to improve service, platform, and infrastructure resiliency
  • You promote openness, diversity of opinions and inclusive discussions at all times to evaluate a wide variety of ideas and perspectives in solving challenging problems
  • You demonstrate clear decision-making and good trade-offs in complex situations comprising multiple opinions, needs, teams, technologies, cloud providers, and architectural settings
  • You communicate effectively with stakeholders ranging from executives to junior engineers across the breadth and depth of the engineering organization
  • You exemplify high accountability, integrity, and resilience to maintain focus on both big-picture goals and milestones to get there
  • You enable the engineering organization to innovate and deliver with greater speed and safety
  • Experience with Jenkins, Terraform, Salt, Kubernetes (AKS), Azure, VMware, Splunk, Docker, Python, Bitbucket
  • Experience with monitoring systems, tracing, and observability to manage large-scale systems and 24x7 availability
  • Work with the Applications, Engineering, Platform, Operations and Infrastructure, and Cloud teams to ensure we are a premier software delivery organization
  • Drive software engineering standard processes into all of our operational spaces by leading from the front and by example
  • Passionate about leading a 24/7 engineering organization and enabling automation
  • Lifelong learner who furthers diversity of thought in their approach to management
  • Willingness to learn eCommerce and/or fulfillment operations and PDL business processes and interdependencies
  • Driven to deliver high-quality results on time
  • Demonstrated ability to deliver solutions that are easily maintainable, understandable, and diagnosable
  • Able to write clear and consumable documentation
  • Track record of building and running high-performance teams
  • Expertise in analyzing complex application, database, network, and OS issues across a large-scale, distributed, customer-facing system
  • System configuration management experience with automation tools such as Puppet, Chef, Ansible, or Salt
  • Advanced experience with programming and/or scripting languages (Python, Java, Bash)
  • Experience with DevOps tools, processes, and culture
  • High-level technical understanding (development methodologies, phases, etc.)
  • Proven track record in working collaboratively across functional areas to get results
  • Interact effectively with representatives from Technology and Business Partners